Moloch's Bargain: Emergent Misalignment When LLMs Compete for Audiences

El, Batu, Zou, James

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly shaping how information is created and disseminated, from companies using them to craft persuasive advertisements, to election campaigns optimizing messaging to gain votes, to social media influencers boosting engagement. These settings are inherently competitive, with sellers, candidates, and influencers vying for audience approval, yet it remains poorly understood how competitive feedback loops influence LLM behavior. We show that optimizing LLMs for competitive success can inadvertently drive misalignment. Using simulated environments across these scenarios, we find that a 6.3% increase in sales is accompanied by a 14.0% rise in deceptive marketing; in elections, a 4.9% gain in vote share coincides with 22.3% more disinformation and 12.5% more populist rhetoric; and on social media, a 7.5% engagement boost comes with 188.6% more disinformation and a 16.3% increase in promotion of harmful behaviors. We call this phenomenon Moloch's Bargain for AI--competitive success achieved at the cost of alignment. These misaligned behaviors emerge even when models are explicitly instructed to remain truthful and grounded, revealing the fragility of current alignment safeguards. Our findings highlight how market-driven optimization pressures can systematically erode alignment, creating a race to the bottom, and suggest that safe deployment of AI systems will require stronger governance and carefully designed incentives to prevent competitive dynamics from undermining societal trust. There are clear economic and social incentives to optimize LLMs and AI agents for competitive markets: a company can increase its profits by generating more persuasive sales pitches, a candidate can capture a larger share of voters with sharper campaign messaging, and an influencer can boost engagement by producing more compelling social media content.
In the presence of both the technology and the incentives, it is natural to expect adoption to move rapidly in this direction. In contrast, the incentives to ensure safety are far weaker. The costs of social hazards--such as deceptive product representation and disinformation on social media--are typically borne by the public rather than the organizations deploying these systems, who may be held accountable only when found legally liable. In this paper, we investigate the critical question: Can optimization for market success inadvertently produce misaligned LLMs? We experimentally show that misalignment consistently emerges from market competition across three different settings.


NewsEdits 2.0: Learning the Intentions Behind Updating News

Spangher, Alexander, Huang, Kung-Hsiang, Cho, Hyundong, May, Jonathan

arXiv.org Artificial Intelligence

As events progress, news articles often update with new information: if we are not cautious, we risk propagating outdated facts. In this work, we hypothesize that linguistic features indicate factual fluidity, and that we can predict which facts in a news article will update using solely the text of the article (i.e., not external resources like search engines). We test this hypothesis, first, by isolating fact updates in large news revision corpora. News articles may update for many reasons (e.g., factual, stylistic, narrative). We introduce the NewsEdits 2.0 taxonomy, an edit-intentions schema that separates fact updates from stylistic and narrative updates in news writing. We annotate over 9,200 pairs of sentence revisions and train high-scoring ensemble models to apply this schema. Then, taking a large dataset of silver-labeled pairs, we show that we can predict when facts will update in older article drafts with high precision. Finally, to demonstrate the usefulness of these findings, we construct a language model question-answering (LLM-QA) abstention task: we wish the LLM to abstain from answering questions when the supporting information is likely to become outdated. Using our predictions, we show that LLM abstention reaches near-oracle levels of accuracy.
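The abstention task described in the abstract can be sketched minimally as a thresholded policy over a predicted probability of a future fact update. This is an illustrative sketch only: the function names and threshold are hypothetical, and the real predictor in the paper is an ensemble model trained on NewsEdits 2.0 annotations, which is stubbed out here as a plain probability input.

```python
# Hypothetical sketch of an LLM-QA abstention policy driven by a
# fact-fluidity predictor. The predictor's output is passed in as a
# probability; training it is out of scope for this sketch.

def should_abstain(update_probability: float, threshold: float = 0.5) -> bool:
    """Abstain when the supporting fact is likely to be updated later."""
    return update_probability >= threshold

def answer_or_abstain(question: str, answer: str,
                      update_probability: float) -> str:
    # Return the model's answer only if the underlying fact looks stable;
    # otherwise emit an explicit abstention marker.
    if should_abstain(update_probability):
        return "[abstain: fact likely to update]"
    return answer
```

With an oracle-quality predictor, this policy abstains exactly on the questions whose answers will become outdated, which is the near-oracle behavior the abstract reports.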


Generative AI-empowered Simulation for Autonomous Driving in Vehicular Mixed Reality Metaverses

Xu, Minrui, Niyato, Dusit, Chen, Junlong, Zhang, Hongliang, Kang, Jiawen, Xiong, Zehui, Mao, Shiwen, Han, Zhu

arXiv.org Artificial Intelligence

In the vehicular mixed reality (MR) Metaverse, the distance between physical and virtual entities can be overcome by fusing the physical and virtual environments with multi-dimensional communications in autonomous driving systems. Assisted by digital twin (DT) technologies, connected autonomous vehicles (AVs), roadside units (RSUs), and virtual simulators can maintain the vehicular MR Metaverse via digital simulations for sharing data and making driving decisions collaboratively. However, large-scale traffic and driving simulation via realistic data collection and fusion from the physical world for online prediction and offline training in autonomous driving systems is difficult and costly. In this paper, we propose an autonomous driving architecture in which generative AI is leveraged to synthesize unlimited conditioned traffic and driving data in simulations for improving driving safety and traffic efficiency. First, we propose a multi-task DT offloading model for the reliable execution of heterogeneous DT tasks with different requirements at RSUs. Then, based on the preferences of AVs' DTs and collected realistic data, virtual simulators can synthesize unlimited conditioned driving and traffic datasets to further improve robustness. Finally, we propose a multi-task enhanced auction-based mechanism to provide fine-grained incentives for RSUs in providing resources for autonomous driving. The property analysis and experimental results demonstrate that the proposed mechanism is strategy-proof and the architecture is effective.


Pakistan police, kin seek murder charge over driver killed along with Taliban chief in U.S. drone strike

The Japan Times

QUETTA, PAKISTAN – The family of a driver who was killed alongside Taliban chief Mullah Akhtar Mansour in a U.S. drone strike in Pakistan has filed a case against U.S. officials, seeking to press murder charges, police said Sunday. Mansour had entered Pakistan from Iran using a false name and fake Pakistani identity documents on May 21, when his car was targeted by a U.S. drone. The driver, who was also killed, was later identified as Mohammed Azam. The police filed a case on behalf of Azam's family, police official Abdul Wakil Mengal said. It was not immediately clear what legal avenues the family can realistically pursue.


Senior Taliban figure says death of leader could unify group

U.S. News

A Pakistani police officer and paramedics stand beside the bodies of two people reportedly killed in a U.S. drone strike in the Ahmad Wal area of Baluchistan province, Pakistan, at a hospital in Quetta, Pakistan, Sunday, May 22, 2016. A senior commander of the Afghan Taliban confirmed on Sunday that the extremist group's leader, Mullah Mohammad Akhtar Mansour, had been killed in the strike.